stringi: Fast and Portable Character String Processing in R

Abstract

Effective processing of character strings is required at various stages of data analysis pipelines: from data cleansing and preparation, through information extraction, to report generation. Pattern searching, string collation and sorting, normalization, transliteration, and formatting are ubiquitous in text mining, natural language processing, and bioinformatics. This paper discusses and demonstrates how and why stringi, a mature R package for fast and portable handling of string data based on ICU (International Components for Unicode), should be included in each statistician's or data scientist's repertoire to complement their numerical computing and data wrangling skills.

Related resources

Pre-orthographic character string processing and parietal cortex: a role for visual attention in reading?

The visual front-end of reading is most often associated with orthographic processing. The left ventral occipito-temporal cortex seems to be preferentially tuned for letter string and word processing. In contrast, little is known of the mechanisms responsible for pre-orthographic processing: the processing of character strings regardless of character type. While the superior parietal lobule has...

Robust and Portable Text Processing

Over the last few months we have begun to adapt our Proteus information extraction system to process reports of terrorist activity. This is being done as part of MUC (Message Understanding Conference) -3, a comparative evaluation of information extraction systems organized by the Naval Ocean Systems Center. These reports are substantially more complex, both syntactically and semantically, than ...

Superstring scattering from D-branes and Wess-Zumino terms in string theory

This thesis pursues two goals. We show that the expansion of the scattering amplitude of one Ramond-Ramond field, two gauge fields, and one tachyon field (caat) is a consistent expansion. This indicates that we have found the expansion for tachyons, which is not a low-energy expansion, and that it is consistent with the expansions of aatt, ctta, cta, and cttt. By comparison with the scattering amplitude, we find a number of corrections to the tachyonic DBI and Wess-Zumino actions.

DEVELOPMENT IN STRING THEORY

String theory is a fast-moving subject, in both its physics and its mathematics. To keep up with the discipline, it is important to follow the new ideas that are being emphasized. Here I wish to give extracts from new papers containing ideas that I have recently found interesting. Six papers are involved: I. "Strings formulated directly in 4 dimensions" A. N...

Journal

Journal title: Journal of Statistical Software

Year: 2022

ISSN: 1548-7660

DOI: https://doi.org/10.18637/jss.v103.i02